INSA LYON and UNI PASSAU's Participation at PAN@CLEF'17: Author Profiling task

نویسندگان

  • Guillaume Kheng
  • Léa Laporte
  • Michael Granitzer
چکیده

This paper describes the participation of INSA Lyon and UNI Passau at the PAN 2017 Author Profiling task. Given the language and tweets from an author, the goal is to predict his/her gender and language variety. We consider two strategies : a "loose" classification that learns one predictive model for the gender and another one for the variety, and a "successive" classification that first predict the gender then learn a predictive model for variety, given the gender. We consider all the languages. We experiment various features representations and machine learning algorithms used in previous PAN Author Profiling editions in order to learn the models. We adapt the features and machine learning algorithm used for each language and each classification task by selecting the configuration that provides the best results in terms of prediction performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Style-based Distance Features for Author Verification Notebook for PAN at CLEF 2013

In this paper we present the approach we took in our participation to the PAN 2013 Author Profiling task. It is an adaptation of our system submitted for author identification, assuming that a profile category (authors belonging to the same gender and age group categories) can be analyzed in the same way as an author’s style.

متن کامل

Style-based Distance Features for Author Profiling Notebook for PAN at CLEF 2013

In this paper we present the approach we took in our participation to the PAN 2013 Author Identification task. It relies on a complex process to select the features which represent the author’s writing, using potentially multiple statistics and distance measures computed from the training set.

متن کامل

Readability for Author Profiling? Notebook for PAN at CLEF 2013

This paper briefly describes the approach taken to the Author Profiling task at PAN 13. It describes the simple features used, and the origins in thinking around text readability as a mechanism for identification, and the predictive model used which may have beneficially omitted classes, as well as offering commentary on the results obtained.

متن کامل

INAOE's Participation at PAN'13: Author Profiling Task Notebook for PAN at CLEF 2013

This paper describes the participation of the Laboratory of Language Technologies of INAOE at PAN 2013 evaluation lab. We adopted second order representations for facing the problem of Author Profiling (AP). This representation tackles two shortcomings of the typical Bag-of-Terms: i) the sparsity and high dimensionality of document representations, and ii) the assumption of total independence b...

متن کامل

Segmenting Target Audiences: Automatic Author Profiling using Tweets: Notebook for PAN at CLEF 2015

This paper describes a methodology proposed for author profiling using natural language processing and machine learning techniques. We used lexical information in the learning process. For those languages without lexicons, we automatically translated them, in order to be able to use this information. Finally, we will discuss how we applied this methodology to the 3rd Author Profiling Task at PA...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017